This brief, which is filed herewith in triplicate, is in furtherance of the Notice
of Appeal, filed in this case on October 30, 2003. 

This brief contains these items under the following headings, and in the order
set forth below (37 C.F.R. §1.192(c)):

I. Real Party in Interest 

II. Related Appeals and Interferences

III. Status of Claims

IV. Status of Amendments 

V. Summary of Invention 

VI. Issues

VII. Grouping of Claims

VIII. Arguments

□ Argument VIIIA. Rejections Under 35 U.S.C. §112, first paragraph

□ Argument VIIIB. Rejections Under 35 U.S.C. §112, second paragraph

□ Argument VIIIC. Rejections Under 35 U.S.C. §102

☒ Argument VIIID. Rejections Under 35 U.S.C. §103

□ Argument VIIIE. Rejection Other Than 35 U.S.C. §§102, 103 and 112

IX. Appendix of Claims Involved in the Appeal

X. Other Materials that Appellant Considers Necessary or Desirable
I. Real Party in Interest

The real party in interest in the appeal is:

□ the party named in the caption of this brief.
☒ the following party: International Business Machines Corp. of
Armonk, New York 
II. Related Appeals and Interferences

With respect to other appeals or interferences that will directly affect, or be 
directly affected by, or have a bearing on the Board's decision in this appeal: 

☒ there are no such appeals or interferences.

□ these are as follows: 
III. Status of Claims

The status of the claims in this application is:

A. Total number of claims in Application 

Claims in the application are: Claims 1 to 6 

B. Status of all the claims: 

1. Claims cancelled: No claims have been cancelled.

2. Claims withdrawn from consideration but not cancelled: No claims have
been withdrawn from consideration but not cancelled.

3. Claims pending: Claims 1 to 6 are pending. 

4. Claims allowed: No claims are allowed. 

5. Claims rejected: Claims 1 to 6 are rejected. 

C. Claims on Appeal.

The claims on appeal are: Claims 1 to 6.
IV. Status of Amendments

The status of amendments filed subsequent to the final rejection is as
follows: 

A proposed amendment under 37 C.F.R. §1.116 was filed on September 22, 
2003. That amendment made an amendment to page 3 of the specification. The 
amendment to page 3 was in response to the Examiner's objection to the specification 
as containing embedded and/or other form of browser-executable code. While the 
Examiner is technically correct that the specification contains what appears to be 
embedded and/or other form of browser-executable code, what is in fact described are 
hypothetical uniform resource locators (URLs) for the purpose of illustrating by 
example the hierarchical structure of Web pages. To avoid confusion with real URLs,
the URLs "www.bank.com/loans", "www.bank.com/loans/auto" and
"www.bank.com/loans/homemortgage" were enclosed in quotes and a parenthetical 
explanation has been added that these are hypothetical, as opposed to real, URLs for 
the sake of the example being described. No amendments were made to the claims. 
The Advisory Action mailed October 22, 2003, indicated that, for purposes of appeal, 
the proposed amendment would be entered, and it is understood that in so indicating, 
the Examiner has withdrawn his objection to the specification. 
V. Summary of Invention

The invention as defined in the claims on appeal is directed to a procedure that 
automates the process of setting up an instance of a conversational natural language 
interface for a Web site. By automating the process of setting up a new Web site, 
anyone can create a new interface. Subsequent manual tuning of the interface is 
possible and much easier to do than creating an interface from scratch. The invention 
solves the problem by bringing together a number of ideas and techniques, some of 
which have been used in natural language processing for other purposes. In order to 
set up an instance of a natural language conversational interface (NLCI), it is 
necessary to 

1) define a hierarchy of topics into which individual documents or Web pages 
can be classified, 

2) provide a keyword index for those documents for an associated search engine, 
and 

3) for each node in the hierarchy, specify a mechanism for associating an input 
natural language (NL) query to the node. (In the preferred embodiment, this 
mechanism is a rule set and associated rule applier.) 

To solve step (1), Applicants noted that the uniform resource locators (URLs) 
of the Web pages associated with a single site are often organized into a coherent 
hierarchy of topics. On reflection, this was not surprising, since good Web design 
encourages logical movement from page to page. Thus, a bank might have a Web 
page with the URL "www.bank.com/loans". It will have links to pages with URLs 
"www.bank.com/loans/auto" and "www.bank.com/loans/homemortgage", and so
forth. (The URLs "www.bank.com/loans", "www.bank.com/loans/auto" and 
"www.bank.com/loans/homemortgage" are hypothetical for this example.) This is 
clearly a topic hierarchy of exactly the kind necessary for establishing the NLCI, in 
which "loans" is a high level node and "auto" and "homemortgage" are nodes
subordinate to it. If these are the lowest level in the hierarchy, the Web pages they 
point to are leaves. 
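
By way of illustration only, the induction of a topic hierarchy from URL structure described above can be sketched in Python. This is a minimal sketch, not the implementation of the specification: the function names `induce_topic_hierarchy` and `leaf_topics` are hypothetical, and the sketch assumes that each path segment of a URL names exactly one topic node.

```python
def induce_topic_hierarchy(urls):
    """Build a nested dict of topics from URL path segments (illustrative only)."""
    tree = {}
    for url in urls:
        # Drop an optional scheme such as "http://", then split into segments.
        path = url.split("://", 1)[-1]
        segments = [s for s in path.split("/") if s]
        node = tree
        for segment in segments:
            # Each segment becomes a child of the preceding segment.
            node = node.setdefault(segment, {})
    return tree


def leaf_topics(tree, prefix=()):
    """Yield the path to every node that has no subordinate nodes."""
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            yield from leaf_topics(children, path)
        else:
            yield path


# Hypothetical URLs, as in the example given in the specification.
urls = [
    "www.bank.com/loans",
    "www.bank.com/loans/auto",
    "www.bank.com/loans/homemortgage",
]
hierarchy = induce_topic_hierarchy(urls)
leaves = sorted(leaf_topics(hierarchy))
```

Under these assumptions, "loans" appears as a node beneath the site root, with "auto" and "homemortgage" as its leaves.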

To solve step (2), Applicants use methods from statistical natural language 
processing. From each document, Applicants generate a set of single words, bi-grams, 
etc., up to n-grams for some number n. However, these are not necessarily sequential
n-grams. Applicants allow gaps between the words making up the n-gram. The term 
"sparse n-gram" is used at places in the patent application to emphasize the possibility 
that there might be gaps between the words in the n-gram. The concept of "sparse 
n-grams" as introduced by Applicants is unique to this patent application. The gaps 
between words are limited by establishing a distance d which is the maximum
separation between the first and last words of the n-gram. This tactic is partial 
compensation for the variability allowed by natural language in expressing phrases. 
For example, one can say "input documents", or one might say "input text 
documents". The method described would generate an n-gram "input documents" 
from both of these. (In the preferred embodiment, words are reduced to stems, so the
actual n-gram generated would be "input document".) The most frequent n-grams
occurring in a document, up to some number w, are used as the keyword index for the
document.
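
The generation of sparse n-grams described above can be illustrated with a short Python sketch. It is offered under stated assumptions and is not the code of the preferred embodiment: the names `sparse_ngrams`, `keyword_index`, `max_n`, `max_span` and `limit` are hypothetical, `max_span` plays the role of the distance d, `limit` plays the role of the cutoff on the number of keywords, and stemming is omitted for brevity.

```python
from collections import Counter
from itertools import combinations


def sparse_ngrams(tokens, max_n=2, max_span=2):
    """Yield ordered word tuples of length 1..max_n whose last word lies
    at most max_span positions after the first word (gaps are allowed)."""
    for start in range(len(tokens)):
        # Positions that may follow the first word without exceeding the span.
        last = min(start + max_span, len(tokens) - 1)
        window = range(start + 1, last + 1)
        for n in range(1, max_n + 1):
            for rest in combinations(window, n - 1):
                yield (tokens[start],) + tuple(tokens[i] for i in rest)


def keyword_index(tokens, limit=5, **kwargs):
    """Return the most frequent sparse n-grams of a document, up to limit."""
    counts = Counter(sparse_ngrams(tokens, **kwargs))
    return [ngram for ngram, _ in counts.most_common(limit)]


contiguous = set(sparse_ngrams("input documents".split()))
gapped = set(sparse_ngrams("input text documents".split()))
# Both phrasings yield the same n-gram when a gap of one word is allowed.
shared = ("input", "documents") in contiguous and ("input", "documents") in gapped
```

With `max_span=1` the sketch reduces to ordinary sequential n-grams, which shows that the gap allowance is what lets "input documents" and "input text documents" produce a common index term.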

Figure 1 is a flow diagram of the automated set up procedure according to the 
invention. A program implementing a Web crawler is invoked in function block 11,
beginning at the home page of the site for which a natural language interface is to be 
generated. The output of this module is a file of Web pages in HyperText Markup 
Language (HTML). In function block 12, the Uniform Resource Locators (URLs) of
the Web pages are processed to induce a hierarchy of topics for the site and the
HTML formatted pages are converted to the appropriate standard format. In a
preferred implementation of the invention, the standard format is Extensible Markup
Language (XML). In function block 13, sparse n-grams are extracted from each page
to serve as index terms for the page. The index terms are used to set up an answer 
generator (search engine) for the page in function block 14. In function block 15, a set 
of sparse n-grams is generated for each of the topics found in function block 12 by 
grouping together all the documents having that topic. Those n-grams satisfying some
criterion for significant association with the topic are saved. In a preferred 
implementation of the invention, the criterion used is the chi-square measure. The 
sparse n-grams are converted to rules in which each term of the n-gram is a term in 
the rule, and the topic is the rule consequent, in function block 16. Optionally, another 
statistical test can be made to associate a confidence measure with each rule. In the 
preferred implementation of the invention, the confidence measure is the percentage 
of time the underlying n-gram occurs in the topic. Once the preceding steps have been 
accomplished, all the necessary data is at hand to finish setting up the natural 
language interface in function block 17. 
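
The selection of topic-distinctive n-grams in function block 15 and their conversion to rules in function block 16 can be sketched as follows. This is a hedged illustration only. It assumes that each document is represented as a set of n-grams, that the chi-square measure is computed on a 2x2 table of document counts, that the threshold of 3.84 (the 5% critical value for one degree of freedom) is an acceptable cutoff, and that the confidence measure is read as the share of documents containing the n-gram that belong to the topic. The names `chi_square`, `Rule` and `make_rules` are hypothetical.

```python
from dataclasses import dataclass


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 table of document counts:
    a = in topic with n-gram,    b = other topics with n-gram,
    c = in topic without n-gram, d = other topics without n-gram."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


@dataclass(frozen=True)
class Rule:
    terms: tuple       # each term of the n-gram is a term of the rule
    topic: str         # the topic is the rule consequent
    confidence: float  # assumed reading of the confidence measure


def make_rules(docs_by_topic, topic, threshold=3.84):
    """Keep n-grams positively associated with the topic and turn them into rules."""
    in_topic = docs_by_topic[topic]
    others = [doc for name, docs in docs_by_topic.items()
              if name != topic for doc in docs]
    rules = []
    for ngram in sorted(set().union(*in_topic)):
        a = sum(ngram in doc for doc in in_topic)
        b = sum(ngram in doc for doc in others)
        c, d = len(in_topic) - a, len(others) - b
        if a * d > b * c and chi_square(a, b, c, d) >= threshold:
            rules.append(Rule(ngram, topic, a / (a + b)))
    return rules


# Hypothetical data: four documents per topic, each a set of n-grams.
docs_by_topic = {
    "auto": [{("auto", "loan"), ("rate",)} for _ in range(4)],
    "homemortgage": [{("home", "mortgage"), ("rate",)} for _ in range(4)],
}
auto_rules = make_rules(docs_by_topic, "auto")
```

In this hypothetical data the n-gram "rate" occurs equally in both topics and is therefore discarded, while "auto loan" is retained as a rule whose consequent is the topic "auto".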

Figure 2 shows the components of the system and their inter-relationships. 
These include the Web crawler module 21 which begins at some designated home 
page(s) and systematically finds all the pages reachable from these initial pages, 
recursively. Using the URLs of these pages, module 22 finds the topic hierarchy of 
this site. Note that there might be more than one root (i.e., initial home page) resulting 
in more than one rooted tree (hierarchy). If there is more than one rooted tree, then the 
final hierarchy is just

Top

root_1 ... root_n

with new top node "Top". Module 23 uses the extracted pages along with the
hierarchy to find key words and sparse phrases which can serve as index terms for the 
respective pages. Module 24 is an optional module for manual review and change of 
the decisions made by the automated system. Module 25 is a rules generating module
which generates rules for each of the topics identified by module 22. Module 25 also 
uses the documents generated by the Web crawler module 21. The rules generated by 
module 25 may optionally be edited manually, as indicated by the interface between 
modules 24 and 25. Module 26 is the interface builder system which uses the outputs 
of modules 23, 25 and, optionally, 24. 
VI. Issues

The sole issue presented on appeal is whether claims 1 to 6 are unpatentable
over U.S. Patent No. 6,311,182 to Colbath et al. in view of U.S. Patent No. 5,819,220
to Sarukkai et al. under the objective standards of 35 U.S.C. § 103(a). 
VII. Grouping of Claims

Group 1 includes claims 1 to 3 and 6. Group 2 includes claims 4 and 5. The 
claims in the two groups are distinct in that the claims of Group 1 include the step of 
defining a hierarchy of topics whereas the claims of Group 2 include the step of 
automatically inducing a topic hierarchy. 

The claims do not stand or fall together. Reasons as to why the grouped claims 
are separately patentable are included in the arguments. 
Argument VIIIA. Rejections Under 35 U.S.C. §112, first paragraph
There are no rejections under 35 U.S.C. §112, first paragraph.
Argument VIIIB. Rejections Under 35 U.S.C. §112, second paragraph
There are no rejections under 35 U.S.C. §112, second paragraph.
Argument VIIIC. Rejections Under 35 U.S.C. §102
There are no rejections under 35 U.S.C. §102.
Argument VIIID. Rejections Under 35 U.S.C. §103

The Examiner alleges that Colbath et al. teach "An automated method for 
setting up a natural language interface in a Web site", but as will be demonstrated 
below, this is not correct. The Examiner further alleges that Colbath et al. teach the
steps of "defining" and "generating" as recited in independent claim 1, but again as
will be demonstrated below, this is also not correct. The Examiner states that
"Colbath does not explicitly teach, 'for each topic in the hierarchy, a set of n-grams to
a topic in the topic hierarchy which set of n-grams is distinctive to the topic and
wherein the n-grams may be sparse or non-sparse n-grams'" (emphasis added). It is
noted here that Colbath et al. neither explicitly nor implicitly teach this feature. The
Examiner relies on Sarukkai et al. for a teaching of this feature, citing column 7, line
27, to column 8, line 11, and column 10, lines 16 to 24, of Sarukkai et al. However,
Sarukkai et al. neither show nor suggest this feature. In fact, as noted in the
Summary of the Invention, the notion of "sparse n-grams" is unique to the claimed
invention and, furthermore, the application of n-grams as described in the subject
application is unique to the claimed invention.

Considering first the patent to Colbath et al., Colbath et al. teach a very
different technology from that of the claimed invention; specifically, a voice-activated
Web browser. In Colbath et al., voice signals are recognized and converted into 
words. These words are used to form a search string, and a search is then performed, 
for example, on the Internet or on a Web site. The search is performed over a
preselected collection of areas of interest. Colbath et al. further disclose methods for 
searching when the search terms do not match with any preselected areas of interest. 

Colbath et al. is very different from the claimed invention for several reasons.
First, the claimed invention is directed to a method for setting up a Web site query 
interface, and Colbath et al., by contrast, is directed towards searching based on voice 
commands. Colbath et al. do not teach setting up a Web query interface, as alleged by 
the Examiner. Second, as recognized by the Examiner, Colbath et al. do not teach the
step of, for each Web site topic, associating a set of n-grams to the topic, which are
distinctive of that topic, as recited in the third step of claim 1. In the preferred
embodiment, these sets of n-grams are converted to classification rules, and claim 6, 
dependent on claim 1, recites this step. 

Colbath et al. do not teach or suggest an automatic method for setting up a 
Web query interface, as alleged by the Examiner. In fact, Colbath et al. is completely 
lacking any suggestion to set up a query interface. Instead, Colbath et al. teaches only 
methods for conducting Web searches using voice commands. 

By comparison, independent claim 1 and dependent claim 3 are directed to 
"setting up a natural language interface in a Web site". Setting up a natural language 
interface according to the present invention requires that documents on a Web site are 
classified, and requires that a keyword index is created for documents in the Web site. 
This allows a person creating the natural language interface to do so efficiently and 
easily. The natural language interface allows a search engine to find documents on a 
Web site set up according to the invention. Colbath et al. do not teach how to create
or set up a natural language interface, but instead teach how to perform a search using 
voice commands. Setting up a natural language interface and performing a search are 
two entirely different and distinct functions. Setting up a natural language interface
allows a search program to search a Web site according to a query protocol (possibly 
specified by the interface), and performing a search finds documents of interest. 
Hence, the teachings of Colbath et al. are not really applicable to the claimed 
invention. 

Specifically, because Colbath et al. do not teach setting up a natural language 
interface, and instead teach performing a search, they necessarily lack, contrary to the 
Examiner's allegation, the essential step of "generating a keyword index for those 
documents", as recited in claim 1. The Examiner argues that Colbath et al. teach this 
limitation in col. 3, lines 1-12. However, in this passage, Colbath et al. explain
something quite different; specifically, that it is the "most probable word strings" of 
the input speech that are searched for. By comparison, in the claimed invention, the 
above-referenced limitation requires that a keyword index is created for a collection 
of documents so that the documents can be searched more effectively. The keyword 
index of the present invention allows a search engine to find documents; the keyword 
index is not searched for, as required by Colbath et al. Instead, the keyword index of 
the present invention represents a field searched in. The Examiner has confused the
search terms with the search field in the Colbath et al. reference. Hence, the teachings 
of Colbath et al. do not include or suggest generating a keyword index as in the 
present invention. 

Also, as noted above, Colbath et al. does not teach a mechanism for 
associating a rule to a topic, as required by claim 1. The claimed invention, and in 
particular the third element of claim 1, is not concerned with speech recognition 
(although it may be compatible with speech recognition). The third element of claim 1 
requires that each topic in the topic hierarchy is associated with a set of n-grams 
which are distinctive of that topic, so that searches can be performed. 

Regarding claim 3, the Examiner argues that Colbath et al. teach a keyword 
index, and that reviewing the keyword index is also taught by Colbath et al. However, 
Colbath et al. do not teach a keyword index according to the present invention. Col. 2, 
lines 20-35, of Colbath et al., identified by the Examiner with reference to claim 3, 
teaches that key words are searched for by providing them to a search engine. Col. 2, 
lines 20-35, does not teach a keyword index as in the present invention, wherein the 
keyword index is created from Web pages and is a field searched in. Hence, Colbath
et al. do not meet the limitations of claim 3. 

Regarding claim 4, Colbath et al. do not teach "creating rules from the sparse
n-grams, wherein each topic has associated rules that are used to decide if a new input
document or query references the topic". This is because Colbath et al. do not teach a
natural language interface, and Colbath et al. do not teach that topics have associated 
rules. Colbath et al. teach only a voice activated search or Web browser, as explained 
above. The above-quoted limitation from claim 4 requires that Web pages or 
documents be classified into a topic hierarchy so that they may be searched according
to the present invention. Colbath et al. do not teach setting up topics or classifying 
data so that it can be searched, and hence do not meet this limitation of claim 4. 

Claims 1 to 3 and 6 of Group 1 include the step of defining a hierarchy of 
topics, whereas claims 4 and 5 of Group 2 include the step of automatically inducing 
a topic hierarchy. As described on page 5 of the specification, module 22 finds the 
topic hierarchy of a site using the URLs of the pages found by the Web crawler 
module 21. 

Sarukkai et al. do teach the use of n-gram language models. However, the 
teachings of Sarukkai et al. are not applicable to the claimed invention because they 
are not directed toward the set-up of a natural language interface. Sarukkai et al. 
instead teach methods for dynamically altering language models according to word 
sets in the documents searched. In other words, the language model is adjusted in 
response to documents found in a search. The n-grams used by Sarukkai et al. are 
used for speech recognition, as known in the art. For example, Sarukkai et al. teach 
smoothing or re-estimating "n-gram language model scores..." (col. 9, lines 20-21,
emphasis added), thereby implying that the n-grams are used for speech recognition. 
N-grams are extremely well known in the art of speech recognition. By comparison, 
the n-grams employed in the present invention are created from documents to be 
searched, and the n-grams are stored as an index for searching. Hence, the n-grams in 
the present invention are used for very different purposes compared to the n-grams of 
Sarukkai et al. Consequently, the n-grams of Sarukkai et al. cannot reasonably be 
combined with Colbath et al. to meet the limitations of claims 2 or 4, as the Examiner 
argues.

Much of the confusion on the part of the examiner comes from two sources: 
(1) the failure to distinguish the field of speech/voice recognition and 
generation/synthesis from text-based natural language processing, e.g., as
ubiquitous in search applications and (2) failure to distinguish a method for setting up 
a system, as in the current invention, from the systems themselves. Beyond that, in the 
two patents referred to and the other references, there is no mention of automated 
methods for setting up any system let alone a Web-based natural language interface. 

To review the claimed invention, the basic set up is the following: 

1. The system implicit in the invention, to which the automated set up methods 
pertain, requires a taxonomy of topics for a collection of documents, assumed 
to be associated with URLs, and a set of classification rules for each topic. 
The classification rules are used to classify user queries into topics as 
described in the now issued patent No. 6,567,805, cited as patent application 
Serial No. 09/570,788 in the cross-reference to related applications on page 1 
of the specification. 

2. The claimed invention specifies how to induce a taxonomy from a set of URLs 
and their associated documents and then a set of classification rules for the 
nodes in the taxonomy. 

3. The method consists of (i) crawling a particular Web site, producing a set of 
Web pages (the documents to be associated with a taxonomy); (ii) using the 
structure of the URLs as the structure of the hierarchy; (iii) extracting from 
individual documents and from groups of documents, so-called sparse
n-grams, each of which is characteristic of a document or group of documents,
where each group is associated with a node in the taxonomy; (iv) determining 
which phrases, whether sparse or not, are characteristic of the document or 
group of documents by some statistical technique for identifying salient 
collocations; and (v) converting the so-called sparse n-grams to classification
rules for use in a classifier as described in patent No. 6,567,805
(cross-referenced as application Serial No. 09/570,788).

Note that the term "sparse n-gram", as defined and used in the disclosed and claimed
invention, refers to a sequence of tokens or words from the text where the tokens or
words may or may not have other words between them. Perhaps the term "sparse n-gram"
has confused the Examiner into thinking that n-grams as used in the art of
speech/voice recognition are relevant to the claimed invention. However, both the
specification as filed and the foregoing explanation have made clear that the claimed 
invention is using the concept of n-grams in a different way than used in the art of 
speech/voice recognition. All that is meant is the more generic notion of a set (or 
sequence) of not necessarily adjacent tokens or words in the text. So for instance, in a 
document about mortgage loan applications, which has the phrase "mortgage loan
application" as distinctive, one would presumably identify the phrase "mortgage loan"
or even the noncontiguous phrase "mortgage application" as characteristic of the 
document. An alternative description would be "sparse phrases", and if this helps the 
Examiner to better understand the disclosed and claimed invention, he is invited to 
substitute that description for the term "sparse n-gram". Note also that there are two 
subcases of determining distinctive collocations (sparse phrases, sparse n-grams): 
those distinctive of a single document and those distinctive of a group of documents. 
Many methods for doing this are well understood in the art, and which one is used is not
material to the general idea of the disclosed and claimed invention. 

While at least one of the cited references mentions crawling the Web as 
part of a search engine, the use to which the crawling of the Web is put is 
entirely different. None of the literature or patents cited touch on the items above. 
Specifically, none of them mention using a taxonomy of topics let alone inducing a 
taxonomy. As the current invention is not about the specific use of the taxonomy or 
classification rules (this is covered in patent No. 6,567,805 cross-referenced as patent
application Serial No. 09/570,788) and none of the cited references or patents mention 
this, it can be seen that they do not say anything relevant about this key part of the 
invention. 

None of the literature or patents cited mention using so-called sparse n-grams 
in the manner used in the current invention, namely, in conjunction with documents
and groups of documents associated with nodes or topics in an (induced) hierarchy to
identify collocations or phrases that are characteristic of the associated document or
group of documents.

None of the literature or patents cited mention converting sparse n-grams or 
collocations into classification rules, whose use is described in the context of a 
classification-based natural language interface for the Web in patent No. 6,567,805 
(cross-referenced as application Serial No. 09/570,788). 

It follows from this that none of the cited literature or patents deal in any way 
with the combination of these methods nor is such combination implicit in the cited
literature or patents singly or in combination. It certainly cannot be reasonably 
maintained, when this is understood, that the claimed invention is anticipated or made 
obvious by the references or combination of references. Nor can it be reasonably 
maintained that the claimed invention is an obvious extension or alteration of what is 
taught in the references. 

Briefly summarizing, Colbath et al. deals with a speech or voice interface that 
involves simple key word matching against a database of topics or microdomains and 
associated predefined keywords or phrases. Colbath et al. does not discuss setting up a 
taxonomy or hierarchy of topics, let alone one induced from a set of URLs. Nor does 
Colbath et al. discuss or mention building a set of classification rules from the content 
associated with a taxonomy or topic hierarchy induced from a set of documents
associated with URLs. Colbath et al. does not discuss how one identifies the topics or 
micro-domains nor how to establish the predefined phrases. In contrast, the claimed
invention deals exclusively with a method for inducing or automatically setting up a
taxonomy of topics (Sarukkai et al. are silent on the matter of hierarchically structured
taxonomies) and with automatically inducing phrases or sparse n-grams distinctive of
documents or groups of documents associated with nodes or topics in the automatically
induced taxonomy. So, the claimed invention and Colbath et al. treat entirely
different topics.

Sarukkai et al. deals with a voice activated browser. In large part, Sarukkai et 
al. deals with how to overcome problems with speech recognition algorithms when 
there are words that are "out of vocabulary". Instead of employing a rewriting style 
grammar, which is non-probabilistic and very rigid, Sarukkai et al. employs n-grams. 
But n-grams also have the problem that they are statically trained on a given corpus,
and the Web will always have many words not in the training corpus, which means
the speech recognition system will encounter out-of-vocabulary words. The invention
claimed by Sarukkai et al. deals with dynamically altering
scores of the statistical language model and acoustic model used in speech recognition
systems. Sarukkai et al. simply does not deal with any of the topics addressed in the
disclosed and claimed invention. The shared use of the term "n-gram" is irrelevant to
the claimed invention, because the two usages are quite distinct at a technical level:
for Sarukkai et al., "n-gram" means a sequence of tokens that are assigned probabilities
within the context of a speech recognition system language model. Many systems use
common technologies, but even here the details of usage are very different. One 
cannot reasonably maintain that Sarukkai et al. anticipates or teaches any features of the
claimed invention. Nor can anyone maintain with reason that the combination of 
Sarukkai et al. and Colbath et al. provides what is claimed as neither one treats any of 
the key items listed above. 

In the passages bridging pages 7 and 8 of the Office Action mailed July 31, 
2003, Paper No. 6, the Examiner responds to the Applicants' arguments with citations
to Schering Corp. v. Geneva Pharmaceuticals Inc., 64 USPQ2d 1032 (DC NJ 2002),
decided August 8, 2002, and to MPEP 2144.01 pertaining to implicit disclosure. 
Neither citation is apposite to the case before this Board. 

The case of Schering Corp. v. Geneva Pharmaceuticals Inc., 64 USPQ2d 
1032 (U.S. District Court District of New Jersey) is an unpublished decision brought
on a motion for summary judgment. As such, it is not competent legal precedent. 
Moreover, the issues considered by the District Court were not issues arising under 35 
U.S.C. §103 but, rather, 35 U.S.C. §102. The quotation repeated with emphasis by the 
Examiner appears at page 1038 of the USPQ citation. Even though the Schering case is
not competent legal precedent, it is instructive to review the case (particularly since 
the Examiner relies on it) to see what the facts were and the basis for the Court's 
decision. 

In this case, Plaintiff Schering Corporation is a pharmaceutical company and
is the sole owner of U.S. Patent No. 4,282,233, which covers a drug known as 
loratadine. This patent was set to expire on June 19, 2002, but Plaintiff obtained a 
six-month extension of protection to December 19, 2002. Plaintiff markets its 
loratadine product under the brand name Claritin. Plaintiff Schering is also the sole 
owner of the central patent at issue, U.S. Patent No. 4,659,716, which covers
"DCL" (DesCarboethoxylLoratadine), one of the metabolites of loratadine. During the
course of its preclinical studies on loratadine, Schering identified DCL as an active
metabolite of loratadine in experiments with laboratory animals and in clinical studies
on humans. The '716 patent will expire in February 2004. It was undisputed that the
'233 patent was issued more than one year before Schering filed the first application 
leading to the '716 patent; therefore, the '233 patent constitutes "prior art" to the '716 
patent under 35 U.S.C. 102(b). 

Defendants are pharmaceutical companies who seek to manufacture a generic 
version of Claritin as soon as the '233 patent's protection expires. It is 
well-established that use of patented inventions solely to develop a generic drug for
purposes of FDA approval does not constitute an infringement of the patent. 
However, in order for a pharmaceutical company to obtain FDA approval for a drug, 
the company must review "The Orange Book" which lists FDA-approved drugs and 
patents related thereto, and then make one of the following four certifications with 
respect to its Abbreviated New Drug Application ("ANDA"): (1) no patent
information regarding the new drug sought to be approved has been filed; (2) such 
patent has expired; (3) the applicant will use the drug after the date on which such 
patent will expire; or (4) such patent is invalid or will not be infringed by the
manufacture, use or sale of the new drug. If a pharmaceutical company seeking to 
manufacture a generic drug files an ANDA with a paragraph 4 certification that the 
patent is "invalid or will not be infringed," the generic company must notify the 
patent owner of such filing and provide the factual or legal basis for the applicant's 
assertion that the patent is invalid or will not be infringed. That certification itself 
gives the owner of the patent a statutory cause of action to sue the generic company 
for infringement even though the generic company has not actually manufactured, 
used or sold the patented drug. Once the patent owner is notified, he has 45 days to 
file suit against the generic company for the infringement. If a suit is not filed within 
45 days, the FDA may issue an approval for the generic drug. However, if a suit is 
filed within the allotted time, the approval may be made effective upon the expiration 
of a 30-month waiting period or such shorter or longer period as a court may order 
because either party to the action failed to reasonably cooperate in expediting the 
action. Once an FDA approval for the generic drug is granted, only monetary 
remedies are available to the patentee, and no injunctive relief shall issue. 

Defendants, when they sought to manufacture the generic version of Claritin 
upon the expiration of the '233 patent, were made aware of the '716 patent, also listed 
in The Orange Book. The Orange Book listing of the '716 patent together with the 
'233 patent forced Defendants to select one of the above-noted certifications 
regarding both patents. Because Plaintiff listed the '716 patent under the Claritin 
('233) entry, Defendants submitted a paragraph 4 certification. 

On the Summary Judgment motion, the Court found that Claims 1 and 3 of the 
'716 patent were "inherently" anticipated by the '233 patent. In arriving at this 
decision, the Court accepted the parties' position that claims 1 and 3 of the '716 
patent covered DCL in any form — whether metabolized within the human body or 
synthetically produced in a purified and isolated form. The Court explained what it 
meant by "inherent" anticipation by citation to 35 U.S.C. §102(b) which states in 
pertinent part: 

"A person shall be entitled to a patent unless — (b) the invention was 
patented or described in a printed publication in this or a foreign 
country, ... more than one year prior to the date of the application for 
patent in the United States ..." 

The Court determined that section 102(b), as applied to the undisputed material facts 
of this case, invalidates Claims 1 and 3 of the '716 patent, since the subject of the 
'716 patent (DCL in either pure or metabolic form) was "described in a printed 
publication" (the '233 patent) "more than one year prior to the date of the application 
for [the '716] patent." The Court went on to say that to serve as an anticipation when 
the reference is silent about the asserted inherent characteristic, such gap in the 
reference may be filled with recourse to extrinsic evidence. Such evidence must make 
clear that the missing descriptive matter is necessarily present in the thing described 
in the reference, and that it would be so recognized by persons of ordinary skill. In re 
Oelrich, 666 F.2d 578, 581, 212 U.S.P.Q. 323, 326 (CCPA 1981) (quoting Hansgirg 
v. Kemmer, 102 F.2d 212, 214, 40 U.S.P.Q. 665, 667 (CCPA 1939)) provides: 

"Inherency, however, may not be established by probabilities or 
possibilities. The mere fact that a certain thing may result from a given 
set of circumstances is not sufficient. [Citations omitted.] If, however, 
the disclosure is sufficient to show that the natural result flowing from 
the operation as taught would result in the performance of the 
questioned function, it seems to be well settled that the disclosure 
should be regarded as sufficient. Continental Can Co., 948 F.2d at 
1268-69." 



The Court stated that to establish inherency, the extrinsic evidence must make 
clear that the missing descriptive matter is necessarily present in the thing described 
in the reference, and that it would be so recognized by persons of ordinary skill. 
Inherency is not necessarily coterminous with the knowledge of those of ordinary skill 
in the art. Artisans of ordinary skill may not recognize the inherent characteristics or 
functioning of the prior art. However, the discovery of a previously unappreciated 
property of a prior art composition, or of a scientific explanation for the prior art's 
functioning, does not render the old composition patentably new to the discoverer. 
Insufficient prior understanding of the inherent properties of a known composition 
does not defeat a finding of anticipation, citing Robert Harmon, Patents and the 
Federal Circuit §3.2(b), at 88 (2001). 

The Court went on to say that the Federal Circuit cases establish that 
knowledge or appreciation of that which anticipates need not be contemporaneous 
with the application for or issuance of the patent under scrutiny. As the Federal 
Circuit stated recently: 



"Thus, a prior art reference may anticipate when the claim limitation or 
limitations not expressly found in that reference are nonetheless 
inherent in it. See In re Oelrich, 666 F.2d at 581; Verdegaal Bros., Inc. 
v. Union Oil Co. of Cal., 814 F.2d 628, 630, 2 USPQ2d 1051, 1053 
(Fed. Cir. 1987). Under the principles of inherency, if the prior art 
necessarily functions in accordance with, or includes, the claimed 
limitations, it anticipates. See In re King, 801 F.2d 1324, 1326, 231 
USPQ 136, 138 (Fed. Cir. 1986). Inherency is not necessarily 
coterminous with the knowledge of those of ordinary skill in the art. 
Artisans of ordinary skill may not recognize the inherent 
characteristics or functioning of the prior art. See id., 801 F.2d at 1326. 
Mehl/Biophile International Corp. v. Milgraum, M.D., 192 F.3d 
1362, 1365 [52 USPQ2d 1303] (Fed. Cir. 1999). As noted from the 
references in Mehl, this is not a new doctrine. See also Titanium Metals 
Corp. v. Banner, 778 F.2d 775 [227 USPQ 773] (Fed. Cir. 1985). 
More recent and extensive treatments of this significant feature of the 
doctrine of inherent anticipation are found in EMI Group North 
America, Inc. v. Cypress Semiconductor Corp., 268 F.3d 1342 [60 
USPQ2d 1423] (Fed. Cir. 2001), and Atlas Powder Co. v. IRECO Inc., 
190 F.3d 1342 [51 USPQ2d 1943] (Fed. Cir. 1999): 

"Inherency is not necessarily coterminous with the 
knowledge of those of ordinary skill in the art.... 
Artisans of ordinary skill may not recognize the 
inherent characteristics or functioning of the prior art.... 
However, the discovery of a previously unappreciated 
property of a prior art composition, or of a scientific 
explanation for the prior art's functioning does not 
render the old composition patentably new to the 
discoverer." 

Applying these principles, the Court determined in the case at bar that the 
natural, inevitable production of metabolic DCL upon human ingestion of loratadine, 
although not fully appreciated by persons of ordinary skill in that field until more 
recently than 1984, demonstrates that this process is an "inherent characteristic or 
functioning" of the use of loratadine, the subject of the '233 patent. Therefore, that 
patent inherently anticipates Claims 1 and 3 of the '716 patent, rendering them 
invalid. 

The Examiner also cites MPEP 2144.01, relating to implicit disclosure, but 
this citation is also inapposite. The case of In re Preda, 401 F.2d 825, 159 USPQ 342 
(CCPA 1968), like the Schering case, considers the question of anticipation under 35 
U.S.C. §102, not obviousness under 35 U.S.C. §103, but unlike the Schering case, the 
Preda case is competent legal precedent. This appeal was from the decision of the 
Patent Office Board of Appeals affirming the rejection of claims 7 and 8 of 
application serial No. 269,707, filed April 1, 1963, entitled "Process for Catalytically 
Producing Carbon Disulphide From Sulphur and Gaseous Hydrocarbons." The 
invention relates to a process for producing carbon disulfide from sulfur vapors and a 
gaseous hydrocarbon using charcoal as a catalyst. The precise invention before the 
court is defined by the two claims on appeal: 

7. A process for producing carbon disulfide from sulfur vapor and 
gaseous hydrocarbon which comprises reacting the sulfur vapor and 
gaseous hydrocarbon in contact with charcoal, as a catalyst, at a 
temperature of about 750°-830°C. and a space velocity of 120-1400. 

8. A process according to claim 7 wherein the hydrocarbon is methane. 

The reference relied on was a book, Thacker and Miller, Industrial and 
Engineering Chemistry, Vol. 36, No. 2, February, 1944, pp. 182-184. Thacker 
disclosed the results of investigations conducted to develop catalysts that bring about 
high rates of reaction between methane and sulfur vapors (to produce carbon 
disulfide) at 700°C. and below. One catalyst found to perform satisfactorily at these 
temperatures was "activated charcoal." 

The Board, in affirming the Examiner, stated "[The] temperature limitation, 
which is the only limitation presenting a possibility of distinction, in our mind, 
appears to be met in the discussion in the first column of page 182, and Fig. 1 [of 
Thacker]." The Court, in affirming the Board's decision, noted that one of the 
relevant portions of column 1 on page 182 of Thacker reads as follows: 

"De Simo * * * recently patented a catalytic process for converting 
methane with sulfur vapors into carbon disulfide at 800° to 1000°C." 

In this column Thacker also describes the temperatures used in his investigations, i.e., 
700°C. and below, as being "much lower than had previously proved feasible * * *." 
Fig. 1 shows the theoretical conversions, at temperatures of from about 400° to 
somewhat above 750°C., for the reaction of hydrogen sulfide with methane and for 
the six reactions of sulfur with methane that the authors considered the most likely to 
occur under the conditions of their investigations. 

The Court noted that Figure 1 of Thacker, by itself, does not disclose every 
limitation in the appealed claims. However, in considering the disclosure of a 
reference, it is proper to take into account not only specific teachings of the reference 
but also the inferences which one skilled in the art would reasonably be expected to 
draw therefrom. In re Shepard, 50 CCPA 1439, 319 F.2d 194, 138 USPQ 148 (1963). 
In this regard, the Court noted the above quoted reference in column 1 to "a 
catalytic process for converting methane with sulfur vapors into carbon disulfide at 
800° to 1000°C.," the statement in column 1 that the temperatures used in Thacker's 
investigations were "much lower than had previously proved feasible for reactions of 
methane with sulfur," and the recognition in Table I and Figure 1 that methane and 
sulfur could be reacted at temperatures above those used by Thacker. This convinced 
the Court that, although Thacker does not expressly state that carbon disulfide could 
be produced by reacting methane and sulfur in the presence of activated charcoal as a 
catalyst at temperatures within the range recited in the appealed claims, there would 
be a recognition of this fact from a consideration of Thacker by one skilled in the art. 

In other words, reading the reference as a whole, taking into consideration all 
that it teaches, what is implicitly taught is anticipatory. But it does not follow that 
what is not taught is implicit. There must be a factual basis for what is said to be 
implicit in the reference, as was clearly the case in the facts before the Court in In re 
Preda. 

The basic considerations which apply to obviousness rejections are set out in 
MPEP 2141: 

"When applying 35 U.S.C. 103, the following tenets of patent law must be 
adhered to: 

"(A) The claimed invention must be considered as a whole; 

"(B) The references must be considered as a whole and must suggest the 
desirability and thus the obviousness of making the combination; 

"(C) The references must be viewed without the benefit of impermissible 
hindsight vision afforded by the claimed invention; and 

"(D) Reasonable expectation of success is the standard with which 
obviousness is determined." 

This is the correct standard. It is an objective standard. It does not support the 
possibility of using hindsight to reconstruct the combination of references, filling in 
the blanks with allegations of "inherency" and "implicitness". The Examiner's 
rejection of the claims is in error as a matter of law. To establish inherency, the 
extrinsic evidence must make clear that the missing descriptive matter is necessarily 
present in the thing described in the reference, and that it would be so recognized by 
persons of ordinary skill. The Examiner has not pointed to any such extrinsic 
evidence. To show that something is implicit, there must be a factual basis in 
the reference to support it. The Examiner has not demonstrated any such factual basis. 
Rather, the Examiner has engaged in impermissible hindsight, ignoring what is 
specifically claimed and what the references as a whole teach, by combining the 
references to Colbath et al. and Sarukkai et al., filling in the blanks with allegations of 
"inherency" and "implicitness". 

As a matter of law, the Examiner's rejection should be reversed. 

Argument VIIIE. Rejection Other Than 35 U.S.C. §§102, 103 and 112 

There are no other rejections of the claims. 

IX. Appendix of Claims Involved in the Appeal (37 C.F.R. §1.192(c)(9)) 

The text of the claims involved in the appeal is as follows: 



1. An automated method for setting up a natural language interface in a Web 
site comprising the steps of: 
    defining a hierarchy of topics into which individual documents or Web 
pages can be classified; 
    generating a keyword index for those documents; and 
    for each topic in the hierarchy, associating a set of n-grams to a topic 
in the topic hierarchy, which set of n-grams is distinctive to that topic and 
wherein the n-grams may be sparse or non-sparse n-grams. 

2. The automated method for setting up a natural language interface in a Web 
site recited in claim 1, wherein the step of generating a keyword index 
comprises the step of extracting sparse n-grams of keywords for each group of 
pages in the topic hierarchy. 

3. The automated method for setting up a natural language interface in a Web 
site recited in claim 1, further comprising the step of optionally reviewing and 
editing the keyword index. 

4. An automated method for setting up a natural language interface in a Web 
site comprising the steps of: 
    automatically inducing a topic hierarchy by examining a structure of 
the Web site; 
    creating n-grams from pages in the Web site that are associated with a 
topic in the topic hierarchy wherein the n-grams may be sparse n-grams or 
non-sparse n-grams; and 
    creating rules from the n-grams, wherein each topic has associated 
rules that are used to decide if a new input document or query references the 
topic. 

5. The automated method for setting up a natural language interface in a Web 
site recited in claim 4, wherein the step of creating rules is performed 
automatically and further comprising the optional step of manually editing the 
rules. 

6. The automated method for setting up a natural language interface in a Web 
site recited in claim 1, further comprising the step of converting the set of 
n-grams to classification rules. 

X. Other Materials that Appellant Considers Necessary or Desirable 

There are no other materials necessary or desirable for consideration of this 
Appeal. 



Respectfully submitted, 